Two timescale convergent Q-learning for sleep-scheduling in wireless sensor networks

Authors

Prashanth L. A.

Abhranil Chatterjee

Shalabh Bhatnagar

Abstract

In this paper, we consider an intrusion detection application for Wireless Sensor Networks (WSNs). We study the problem of scheduling the sleep times of the individual sensors, where the objective is to maximize the network lifetime while keeping the tracking error to a minimum. We formulate this problem as a partially-observable Markov decision process (POMDP) with continuous state-action spaces, in a manner similar to Fuemmeler and Veeravalli [2008]. However, unlike their formulation, we consider infinite horizon discounted and average cost objectives as performance criteria. For each criterion, we propose a convergent on-policy Q-learning algorithm that operates on two timescales, while employing function approximation. Feature-based representations and function approximation are necessary to handle the curse of dimensionality associated with the underlying POMDP. Our proposed algorithm incorporates a policy gradient update using a one-simulation simultaneous perturbation stochastic approximation (SPSA) estimate on the faster timescale, while the Q-value parameter (arising from a linear function approximation architecture for the Q-values) is updated in an on-policy temporal difference (TD) algorithm-like fashion on the slower timescale. The feature selection scheme employed in each of our algorithms manages the energy and tracking components in a manner that assists the search for the optimal sleep-scheduling policy. For the sake of comparison, in both discounted and average settings, we also develop a function approximation analogue of the Q-learning algorithm. This algorithm, unlike the two-timescale variant, does not possess theoretical convergence guarantees. Finally, we also adapt our algorithms to include a stochastic iterative estimation scheme for the intruder’s mobility model, which is useful in settings where the latter is not known.
Our simulation results on a synthetic 2-dimensional network setting suggest that our algorithms result in better tracking accuracy at the cost of only a few additional sensors, in comparison to a recent prior work.
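The building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the quadratic toy cost, and the step-size exponents are assumptions made only for illustration, and the sensor-network POMDP itself is not modelled. The sketch shows a one-simulation SPSA gradient estimate (a single cost evaluation at a randomly perturbed parameter), a TD(0)-style update of linear Q-value weights, and a pair of step-size sequences whose ratio tends to zero, which is what separates the two timescales.

```python
import random


def rademacher(dim, rng):
    """Draw a random +1/-1 perturbation direction of length `dim`."""
    return [rng.choice((-1.0, 1.0)) for _ in range(dim)]


def spsa_one_sim_gradient(cost, theta, delta, perturb):
    """One-simulation SPSA gradient estimate of `cost` at `theta`.

    Only one (possibly noisy) evaluation of `cost` is used, taken at the
    perturbed point theta + delta * perturb.
    """
    value = cost([t + delta * p for t, p in zip(theta, perturb)])
    return [value / (delta * p) for p in perturb]


def td_update(weights, features, next_features, cost, gamma, step):
    """One TD(0)-style update for a linear model Q(s, a) = w . phi(s, a)."""
    q_now = sum(w * f for w, f in zip(weights, features))
    q_next = sum(w * f for w, f in zip(weights, next_features))
    td_error = cost + gamma * q_next - q_now
    return [w + step * td_error * f for w, f in zip(weights, features)]


def step_sizes(n):
    """Illustrative step sizes (assumed exponents): slow / fast -> 0 as n grows."""
    fast = 1.0 / (n + 1) ** 0.6
    slow = 1.0 / (n + 1)
    return fast, slow


if __name__ == "__main__":
    rng = random.Random(0)
    toy_cost = lambda x: sum(v * v for v in x)  # hypothetical stand-in cost
    print(spsa_one_sim_gradient(toy_cost, [1.0, -2.0], 0.1, rademacher(2, rng)))
```

A single one-simulation estimate is very noisy. It is only its average over the random perturbation directions that tracks the true gradient, which is why such estimates are paired with diminishing step sizes.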

Similar resources

A JOINT DUTY CYCLE SCHEDULING AND ENERGY AWARE ROUTING APPROACH BASED ON EVOLUTIONARY GAME FOR WIRELESS SENSOR NETWORKS

Network throughput and energy conservation are two conflicting important performance metrics for wireless sensor networks. Since these two objectives are in conflict with each other, it is difficult to achieve them simultaneously. In this paper, a joint duty cycle scheduling and energy aware routing approach is proposed based on evolutionary game theory which is called DREG. Making a trade-off ...


Multicast Routing in Wireless Sensor Networks: A Distributed Reinforcement Learning Approach

Wireless Sensor Networks (WSNs) consist of independent distributed sensors with storing, processing, sensing and communication capabilities to monitor physical or environmental conditions. There are a number of challenges in WSNs because of limitations in battery power, communications, computation and storage space. In recent years, computational intelligence approaches such as evolutionar...


An Adaptive Congestion Alleviating Protocol for Healthcare Applications in Wireless Body Sensor Networks: Learning Automata Approach

Wireless Body Sensor Networks (WBSNs) involve a convergence of biosensors, wireless communication and network technologies. WBSNs enable real-time healthcare services for users. Wireless sensors can be used to monitor patients’ physical conditions and transfer real-time vital signs to the emergency center or individual doctors. Wireless networks are subject to more packet loss and congestion. T...


A novel sleep/wakeup power management in wireless sensor network: A Fuzzy TOPSIS approach

A wireless sensor network (WSN) typically comprises many tiny nodes equipped with processors, a sender/receiver antenna and a limited battery that is impossible or uneconomical to recharge. Network lifespan is therefore one of the most critical issues, because the batteries used in a WSN are limited and non-renewable. Several mechanisms have been proposed to prolong network lifespan such as LEACH,...


Journal title:

Wireless Networks

Volume 20

Publication date: 2014
